Anecdotal evidence shows promoters being reused separately from their
downstream gene, thus providing a mechanism for the efficient and rapid
rewiring of a gene's transcriptional regulation. We have identified over
4,000 groups of highly similar promoters using a conservative sequence
similarity search in all fully sequenced prokaryotic genomes. About 6%
of those groups are shared between bacteria that diverge at different
taxonomic depths, including different genera, families, orders, classes
and even phyla. Database searches against known mobile elements and RNA
motifs indicate that regulatory motifs such as riboswitches could be
moved around on putative mobile promoters.

Reuse of protein-coding DNA sequences through gene duplication and
horizontal gene transfer is a well-known and profound innovative force
in nature; in sharp contrast to this, the mobility of a gene's
transcription regulatory function encapsulated in its promoter region is
much less well known. There are a few well-studied classes of mobile
genetic elements that harbour functional promoters, like Correia
elements, ERICs and REPIN. But examples of duplicated promoters not
associated with known mobile elements also suggest that promoter reuse
could represent a rampant and rapid mechanism of gene rewiring. In a
recent publication Blount et al. identified a promoter capture event as
a crucial step in the evolution of aerobic citrate utilization by a
population of Escherichia coli in a long-term evolution experiment, and
speculated that promoter capture may be an important and little
appreciated adaptive force in genome evolution. Similarly, Bongers et
al. described the activation of a silent lactate dehydrogenase gene by
promoter recruitment in Lactococcus lactis. In both studies insertion
sequences (IS) were involved in promoter mobility, though Blount et al.
also found cases that were not associated with IS elements. In order to
estimate the relevance of promoter recruitment in genome evolution we
made a conservative inventory of such events in prokaryotes, which was
recently published in Nucleic Acids Research.

Tip of the Iceberg 

To assess the extent of promoter reuse in bacteria we looked, within
each genome, for groups of bacterial genes that share highly similar
sequences upstream of their transcriptional start site, but do not have
obvious flanking paralogous coding sequences. More specifically, we
extracted in silico the DNA region between positions -150 and -50
relative to the start of translation for all genes in a genome
(including plasmids), except when this overlapped with the coding region
or promoter region of a flanking gene. In Escherichia coli the majority
of the transcriptional start sites were shown to be between 20 and 40
nucleotides upstream of the translational start site, so most of our
upstream sequence fragments will not contain the important -10
(Pribnow) box, but should include the -35
element. Using BLAST we then searched for sequence pairs that matched
with 80% or more nucleotide identity over at least 50 nucleotides, to
select for highly similar regions rather than for short conserved DNA
elements. Sequences with more than one hit in the database were
clustered into families. Sequence pairs that in addition showed a high
nucleotide identity in their adjacent coding sequences were assumed to
be paralogs and excluded, because for this study we were interested in
the independent mobility of promoters, not duplicated regions (for
details see the Materials and Methods in our Nucleic Acids Research
paper).

Table 1. Number of PMPs shared by bacterial genomes from different
genera, families, orders, etc.

Branch point    Count
Genera             28
Families           12
Orders              9
Class               9
Phylum              4
Domain              0
Total              62
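
The pair selection and family clustering described above can be
sketched in a few lines of Python. This is a minimal illustration under
stated assumptions, not the pipeline used in the study: the `Hit` record,
the function name `cluster_families` and the 80% cutoff used to flag
paralogous coding sequences (`max_cds_identity`) are hypothetical, since
the text only speaks of "a high nucleotide identity" for that filter.
Families are formed here by single-linkage clustering with a union-find
structure, which is one plausible reading of "clustered into families".

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One pairwise upstream-region match (hypothetical record layout)."""
    query: str           # gene whose upstream region was the query
    subject: str         # gene whose upstream region was hit
    identity: float      # percent nucleotide identity of the upstream match
    length: int          # alignment length in nucleotides
    cds_identity: float  # percent identity of the adjacent coding sequences


def cluster_families(hits, min_identity=80.0, min_length=50,
                     max_cds_identity=80.0):
    """Group genes into families by single linkage over accepted hits.

    min_identity and min_length follow the thresholds quoted in the text;
    max_cds_identity is an assumed cutoff for discarding likely paralogs.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for h in hits:
        if h.query == h.subject:
            continue  # ignore self-hits
        if h.identity < min_identity or h.length < min_length:
            continue  # too dissimilar or too short
        if h.cds_identity >= max_cds_identity:
            continue  # flanking genes look paralogous: a duplicated region
        parent[find(h.query)] = find(h.subject)

    families = defaultdict(set)
    for gene in list(parent):
        families[find(gene)].add(gene)
    return sorted((sorted(members) for members in families.values()),
                  key=lambda fam: fam[0])


example_hits = [
    Hit("geneA", "geneB", 92.0, 60, 35.0),   # accepted
    Hit("geneB", "geneC", 85.0, 55, 40.0),   # accepted, links C to A and B
    Hit("geneD", "geneE", 95.0, 100, 97.0),  # rejected: paralogous coding regions
    Hit("geneF", "geneG", 78.0, 80, 30.0),   # rejected: identity below 80%
    Hit("geneH", "geneI", 90.0, 40, 30.0),   # rejected: shorter than 50 nt
]
print(cluster_families(example_hits))  # [['geneA', 'geneB', 'geneC']]
```

Single linkage is a permissive choice: two upstream regions end up in
the same family as soon as a chain of accepted hits connects them, even
if they do not match each other directly.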

We analyzed all available complete prokaryotic genomes (1,362; July
2011) and even with our strict selection criteria found over 4,000
families of highly similar sequences upstream of apparently unrelated
coding sequences. The majority of these families actually consist of
pairs that on average share 92% nucleotide identity, which at the
minimum alignment length corresponds to 46 out of 50 base pairs being
conserved, but we also found pairs that were completely identical over
100 base pairs. Whether this high level of identity is the result of
strong selective pressure or indicative of recent duplication events
remains to be investigated. We termed these homologous non-coding
sequences Putative Mobile Promoters (PMPs). In fact, some of these
sequences are likely not promoters but have a different function that
causes their conservation. Looking for known elements in our PMP set we
found 42 tRNAs, 83 sequences resembling other RNA families such as the
regulatory riboswitches and, interestingly, 210 known insertion
sequences. The more than 4,000 families that our study uncovered
represent only a small sample of a large pool of repeated DNA in
promoter regions, and provide a conservative reference for promoter
reuse in prokaryotes. We anticipate that many relevant examples of the
phenomenon remain undetected because of our strict criteria. For
example, filtering out paralogous genes also removes mobile promoters
that extend into the coding region, like reported cases of Correia
elements that overlap with an ORF. In addition, our initial extraction
of promoter sequences is sensitive to wrongly annotated translational
start sites, which is a known issue with genome annotation pipelines.
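
The sensitivity to start-site annotation follows from simple coordinate
arithmetic on the -150 to -50 window. The sketch below is illustrative
only: it treats the -10 box and the -35 element as single points located
10 and 35 nucleotides upstream of the transcriptional start site, and the
function names and the `annotation_shift` parameter are hypothetical.
Positions are expressed relative to the true translation start, with
negative values lying upstream.

```python
WINDOW = (-150, -50)  # extracted region, relative to the annotated start


def element_position(tss_offset, element_offset):
    """Position of a promoter element relative to the true translation start.

    tss_offset: distance (nt) of the transcriptional start site upstream
                of the translation start (20 to 40 for most E. coli genes,
                according to the text).
    element_offset: nominal distance of the element upstream of the
                    transcriptional start site (10 or 35).
    """
    return -(tss_offset + element_offset)


def in_window(position, annotation_shift=0, window=WINDOW):
    """True if the element falls inside the extracted window.

    annotation_shift: how far (nt) the annotated translation start lies
    downstream (+) or upstream (-) of the true start (assumed parameter).
    """
    low, high = window
    relative_to_annotated = position - annotation_shift
    return low <= relative_to_annotated <= high


for tss_offset in (20, 30, 40):
    minus10 = element_position(tss_offset, 10)
    minus35 = element_position(tss_offset, 35)
    print(tss_offset,
          in_window(minus10),                        # -10 box captured?
          in_window(minus35),                        # -35 element captured?
          in_window(minus35, annotation_shift=-30))  # start annotated 30 nt too far upstream
```

With a correct annotation the -35 element falls inside the window for
all three offsets, while the -10 box is captured only at the 40 nt
extreme. If the annotated start lies 30 nucleotides upstream of the true
start, the -35 element drops out of the window in every case.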

Horizontal Promoter Transfer? 

Even more surprising than the large number of promoter pairs sharing
high nucleotide identity within one bacterial genome is that about 6%
are shared between distantly related species. Clustering these based on
sequence similarity resulted in 62 distinct groups, of which four are
present in species that are related only by belonging to the same phylum
(Table 1). As expected, inter-taxon transfers seem to decrease with
phylogenetic distance, and at the domain level, i.e., between Archaea
and Bacteria, no transfer events were observed. Some non-coding sequence
elements like tRNAs are very well conserved over large evolutionary
distances, but if highly similar sequences are found only in a small
number of distantly related species, horizontal gene transfer is a more
likely scenario. The large majority of the PMPs are located on a
chromosome, but for one group of PMPs all members are in fact on
plasmids. These plasmids are associated with multiple-drug resistance in
pathogenic Salmonella and are frequently transferred between bacterial
species.
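
Assigning a shared PMP group to a branch point, as tallied in Table 1,
amounts to finding the broadest taxonomic rank at which the host genomes
differ. The following sketch assumes that each genome comes with a
lineage dictionary keyed by rank; the rank list, the function name
`branch_point` and the example lineages are illustrative and are not
taken from the study. It also assumes the reading of Table 1 in which
"Domain" stands for a group shared between Archaea and Bacteria.

```python
from collections import Counter

RANKS = ["domain", "phylum", "class", "order", "family", "genus"]  # broad to narrow


def branch_point(lineages):
    """Return the broadest rank at which the given lineages disagree.

    lineages: iterable of dicts mapping each rank in RANKS to a taxon name.
    Returns None if all lineages agree down to genus.
    """
    lineages = list(lineages)
    for rank in RANKS:
        if len({lineage[rank] for lineage in lineages}) > 1:
            return rank
    return None


# Example lineages, written out by hand for illustration.
escherichia = dict(zip(RANKS, ["Bacteria", "Proteobacteria",
                               "Gammaproteobacteria", "Enterobacterales",
                               "Enterobacteriaceae", "Escherichia"]))
salmonella = dict(zip(RANKS, ["Bacteria", "Proteobacteria",
                              "Gammaproteobacteria", "Enterobacterales",
                              "Enterobacteriaceae", "Salmonella"]))
bacillus = dict(zip(RANKS, ["Bacteria", "Firmicutes", "Bacilli",
                            "Bacillales", "Bacillaceae", "Bacillus"]))

# Hypothetical PMP groups, each listed with the genomes that carry it.
groups = {
    "group_1": [escherichia, salmonella],
    "group_2": [escherichia, bacillus],
    "group_3": [escherichia, salmonella, bacillus],
}
tally = Counter(branch_point(members) for members in groups.values())
print(tally)
```

A group is counted once, at the broadest rank separating any two of its
hosts, so a group present in both Escherichia and Bacillus is tallied at
the phylum level regardless of how many closer relatives also carry it.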

Although the genetic code for translating DNA to protein is extremely
well conserved between species as distant as Escherichia coli and Homo
sapiens, transcriptional cis-regulatory elements are much more variable,
and their activity can differ even between strains of the same species.
It can therefore be expected that the 62 homologous PMPs are not
primarily transcription factor binding sites, but rather have other
(regulatory) functions causing their high conservation. Indeed, two of
the PMPs that are shared between families of bacteria are known
S-adenosylmethionine (SAM) binding riboswitches. The other 60 PMPs,
however, did not resemble any of the RNA families included in the Rfam
database, so their function at present remains unknown.

We conclude that we have uncovered a large number of putative mobile
promoter families, present in numerous bacterial genomes. These may be
involved in rapid adaptive processes via transcriptional rewiring, or
include post-transcriptional regulatory functions. How these PMPs move
within and between genomes is still unknown, but given the large number
of families, diverse mobilization mechanisms may be involved.

Finally, although transcription regulation in eukaryotes is more complex
than in bacteria, it seems obvious that promoter reuse also offers a
mechanism for rapid adaptation of gene expression in eukaryotes. It
would therefore be very interesting to extend our analysis to this
domain, especially now that more genomes and transcriptomes are becoming
available, which greatly facilitates the mapping of the core promoters.